Python is an alternative scripting language that has become very popular among data analysts. In contrast to R, Python is a general-purpose language for which statistical and visualization packages have been developed, notably numpy (and the related scipy) and matplotlib. matplotlib is much closer to R's base graphics than to ggplot, but it provides a general visualization framework for Python. Its plotting syntax is very similar to MATLAB's.
In [1]:
    
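%pylab inline
# If you are using a newer version of IPython, change this to:
# %matplotlib inline
# %matplotlib inline does not import numpy names such as linspace, so then also run: from numpy import linspace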
    
    
In [2]:
    
import matplotlib.pyplot as plt
    
In [3]:
    
x = linspace(0, 5, 10)
y = x ** 2
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 1, 1]) # add axes at [left, bottom, width, height], given as fractions of the figure size
axes.plot(x, y, 'g')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title')
    
    Out[3]:
    
Here we have defined the x values as 10 evenly spaced points from 0 to 5 (both endpoints included), and the y values as the square of x.
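As a quick check of what `linspace` returns, the following sketch prints the points and the gap between them. It assumes numpy is imported explicitly as `np` rather than pulled in through `%pylab inline`.

```python
import numpy as np

# 10 evenly spaced points from 0 to 5, with both endpoints included
x = np.linspace(0, 5, 10)
y = x ** 2  # element-wise square

print(x.round(3))   # the x values
print(np.diff(x))   # gap between neighbouring points
print(y.round(3))   # the y values
```

Because both endpoints are included, 10 points give 9 intervals, so the spacing is 5/9 rather than 0.5.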
In [4]:
    
import pandas as pd
    
In [5]:
    
metaDF = pd.read_csv("metabolomics_reshapedData.csv")
    
In [6]:
    
# Combine the conditions into one boolean mask instead of chaining [] lookups
fresh = (metaDF['treat'] == "fresh") & (metaDF['value'] <= 5000)
freshTom = metaDF.loc[fresh & (metaDF['Species'] == 'tomatillo'), 'value']
freshPum = metaDF.loc[fresh & (metaDF['Species'] == 'pumpkin'), 'value']
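To see what this kind of filter selects, here is a sketch on a tiny made-up table, where the result is easy to verify by eye. The column names match `metaDF`, but the rows are invented for illustration and are not taken from metabolomics_reshapedData.csv. The three conditions are combined with `&` into one mask, which selects the same rows as applying the conditions one after another.

```python
import pandas as pd

# Hypothetical stand-in for the real data; values are invented
df = pd.DataFrame({
    "treat":   ["fresh", "fresh", "lyoph", "fresh", "lyoph"],
    "Species": ["tomatillo", "pumpkin", "tomatillo", "tomatillo", "pumpkin"],
    "value":   [120.0, 300.0, 450.0, 9000.0, 50.0],
})

# Each comparison yields a boolean Series; & combines them row by row.
# The parentheses are required because & binds tighter than == and <=.
mask = (
    (df["treat"] == "fresh")
    & (df["Species"] == "tomatillo")
    & (df["value"] <= 5000)
)

fresh_tom = df.loc[mask, "value"]
print(fresh_tom.tolist())
```

Only the first row survives: the other fresh tomatillo row has a value above the 5000 cutoff.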
    
In [7]:
    
nBin = 100
plt.hist(freshTom, bins=nBin)
    
    Out[7]:
    
In [8]:
    
plt.hist(freshPum, bins=nBin)
    
    Out[8]:
    
In [9]:
    
# Same filter as for the fresh samples, now for the "lyoph" treatment
lyoph = (metaDF['treat'] == "lyoph") & (metaDF['value'] <= 5000)
lyophTom = metaDF.loc[lyoph & (metaDF['Species'] == 'tomatillo'), 'value']
lyophPum = metaDF.loc[lyoph & (metaDF['Species'] == 'pumpkin'), 'value']
    
In [10]:
    
plt.hist(lyophTom, bins=nBin)
    
    Out[10]:
    
In [11]:
    
plt.hist(lyophPum, bins=nBin)
    
    Out[11]:
    
That's great, but is there a way to show all four histograms together in one figure?
In [12]:
    
fig, axes = plt.subplots(nrows=2, ncols=2, sharey=True, sharex=True, squeeze=True)
axes[0,0].hist(lyophPum, bins=nBin)
axes[0,0].set_title("Lyoph Pumpkin")
axes[0,1].hist(lyophTom, bins=nBin)
axes[0,1].set_title("Lyoph Tomatillo")
axes[1,0].hist(freshPum, bins=nBin)
axes[1,0].set_title("Fresh Pumpkin")
axes[1,1].hist(freshTom, bins=nBin)
axes[1,1].set_title("Fresh Tomatillo")
fig.tight_layout()
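The same 2x2 grid can also be filled in a loop rather than one panel at a time. This is a minimal sketch, not the notebook's own code: the `samples` dictionary holds randomly generated stand-in values (the real data come from `metaDF`), and `axes.flat` is used to walk the grid row by row.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data, one array per panel
samples = {
    "Lyoph Pumpkin": rng.exponential(800, 500),
    "Lyoph Tomatillo": rng.exponential(600, 500),
    "Fresh Pumpkin": rng.exponential(900, 500),
    "Fresh Tomatillo": rng.exponential(700, 500),
}

fig, axes = plt.subplots(nrows=2, ncols=2, sharex=True, sharey=True)

# axes is a 2x2 array; axes.flat visits it in row-major order:
# [0,0], [0,1], [1,0], [1,1]
for ax, (title, values) in zip(axes.flat, samples.items()):
    ax.hist(values, bins=100)
    ax.set_title(title)

fig.tight_layout()
```

The loop form scales to larger grids without repeating the `hist` and `set_title` calls for every panel.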
    
    
Can we overlay the histograms, as we did in R with ggplot?
In [14]:
    
bins = linspace(0, 5000, 20)  # set up the bins in advance
plt.hist(lyophPum, bins, alpha=0.5, label='Lyoph Pumpkin')
plt.hist(lyophTom, bins, alpha=0.5, label='Lyoph Tomatillo')
plt.legend()
    
    Out[14]:
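Setting up the bins in advance matters because overlaid histograms are only comparable bar for bar when they share the same bin edges. The sketch below illustrates this with `numpy.histogram`, which matplotlib's `hist` uses to bin the data. The two arrays are random stand-ins, not the metabolomics values.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.uniform(0, 5000, 200)  # hypothetical sample spanning the full range
b = rng.uniform(0, 2500, 200)  # hypothetical sample spanning half the range

# 20 edges define 19 bins
bins = np.linspace(0, 5000, 20)

# With explicit edges, both samples are counted in identical bins
counts_a, edges_a = np.histogram(a, bins=bins)
counts_b, edges_b = np.histogram(b, bins=bins)

# With only a bin count, each sample gets edges fitted to its own min and max
_, auto_a = np.histogram(a, bins=19)
_, auto_b = np.histogram(b, bins=19)

print(np.array_equal(edges_a, edges_b))  # shared edges
print(np.allclose(auto_a, auto_b))       # automatically chosen edges
```

With automatically chosen edges, the bars of the two histograms would have different widths and positions, which makes an overlay hard to read.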
    
A Python port of ggplot is available; however, it still appears to be very much beta software.
In [15]:
    
from ggplot import *
    
In [16]:
    
ggplot(diamonds, aes('carat', 'price')) + geom_point(alpha=1/20.) + ylim(0, 20000)
    
    
    Out[16]:
In [17]:
    
ggplot(diamonds, aes(x='price', color='cut')) + geom_density()
    
    
    
In [37]:
    
x = linspace(0, 5, 10)
y = x ** 2
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 1, 1]) # add axes at [left, bottom, width, height], given as fractions of the figure size
axes.plot(x, y, 'g')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title');
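fig.savefig("testFig.png", bbox_inches="tight")  # the axes extend past the figure edge, so save with a tight bounding box to avoid clipping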
    
    
In [38]:
    
x = linspace(0, 5, 10)
y = x ** 2
fig = plt.figure(dpi=400)
axes = fig.add_axes([0.1, 0.1, 1, 1]) # add axes at [left, bottom, width, height], given as fractions of the figure size
axes.plot(x, y, 'g')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title');
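fig.savefig("hiRes.png", dpi=400, bbox_inches="tight")  # pass dpi explicitly; the tight bounding box keeps the oversized axes in frame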
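The `dpi` setting controls how many pixels the saved image contains: when the figure is saved at its full size (no tight bounding box), the pixel dimensions are the figure size in inches multiplied by the dpi. The sketch below checks this by saving to memory and reading the size from the PNG header. `png_pixel_size` is a hypothetical helper written for this illustration, and the 4 by 3 inch figure size and the axes position are assumptions chosen to give round numbers.

```python
import io
import struct

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt


def png_pixel_size(fig, dpi):
    """Save fig as a PNG in memory and return (width, height) in pixels."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=dpi)
    # A PNG file starts with an 8-byte signature, followed by the IHDR chunk:
    # 4-byte length, 4-byte type, then width and height as big-endian uint32.
    width, height = struct.unpack(">II", buf.getvalue()[16:24])
    return width, height


fig = plt.figure(figsize=(4, 3))            # figure size in inches
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])   # axes kept inside the figure
axes.plot([0, 1, 2], [0, 1, 4], "g")

low = png_pixel_size(fig, dpi=100)
high = png_pixel_size(fig, dpi=400)
print(low, high)
```

Quadrupling the dpi quadruples each pixel dimension, so the file holds sixteen times as many pixels for the same physical figure size.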